18 research outputs found

    Enabling policy making processes by unifying and reconciling corporate names in public procurement data. The CORFU technique

    Get PDF
    This paper introduces the design, implementation and evaluation of the CORFU technique to deal with corporate name ambiguities and heterogeneities in the context of public procurement metadata. The technique is applied to the PublicSpending.net initiative to show how the unification of corporate names is the cornerstone for providing a visualization service that can help policy-makers detect and anticipate upcoming needs. Furthermore, a research study to evaluate the precision, recall and robustness of the proposed technique is conducted using more than 40 million names extracted from public procurement datasets (Australia, United States and United Kingdom) and the CrocTail project
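
    The abstract does not detail CORFU's unification rules. As a minimal sketch of the general idea (normalising corporate names before grouping them), assuming an illustrative list of legal-form suffixes rather than the technique's actual rule set:

        import re
        from collections import defaultdict

        # Illustrative list of legal-form suffixes; CORFU's actual rules are not given in the abstract.
        LEGAL_SUFFIXES = {"inc", "incorporated", "ltd", "limited", "llc", "corp", "corporation", "plc", "co"}

        def normalize(name: str) -> str:
            """Reduce a corporate name to a comparable key."""
            tokens = re.findall(r"[a-z0-9]+", name.lower())
            tokens = [t for t in tokens if t not in LEGAL_SUFFIXES]
            return " ".join(tokens)

        def unify(names):
            """Group raw supplier names that share the same normalized key."""
            groups = defaultdict(list)
            for raw in names:
                groups[normalize(raw)].append(raw)
            return groups

        if __name__ == "__main__":
            sample = ["ACME Inc.", "Acme, Incorporated", "Acme Ltd", "Globex Corporation"]
            for key, variants in unify(sample).items():
                print(key, "<-", variants)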

    SKYWare: The Unavoidable Convergence of Software towards Runnable Knowledge

    Get PDF
    There has been a growing awareness of deep relations between software and knowledge. Software, from an efficiency-oriented way to program computing machines, gradually converged to human-oriented runnable knowledge. Apparently this has happened unintentionally, but knowledge is not incidental to software. The basic thesis: runnable knowledge is the essence of abstract software. A knowledge distillation procedure is offered as a constructive feasibility proof of the thesis. A formal basis is given for these notions. Runnable knowledge is substantiated in the association of semantic structural models (like ontologies) with formal behavioral models (like UML statecharts). Meaning functions are defined for ontologies in terms of concept densities. Examples are provided to concretely clarify the meaning and implications of knowledge runnability. The paper concludes with the runnable knowledge convergence point: SKYWare, a new term designating the domain in which content meaning is completely independent of any underlying machine
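
    The paper's meaning functions over concept densities are not defined in the abstract. A purely illustrative sketch, assuming density is taken as the share of ontology relations incident to each concept:

        from collections import Counter

        def concept_densities(relations):
            """Illustrative density: share of ontology edges touching each concept.
            `relations` is a list of (subject, predicate, object) triples."""
            incidences = Counter()
            for s, _p, o in relations:
                incidences[s] += 1
                incidences[o] += 1
            total = sum(incidences.values()) or 1
            return {concept: count / total for concept, count in incidences.items()}

        if __name__ == "__main__":
            onto = [("Car", "is_a", "Vehicle"), ("Wheel", "part_of", "Car"), ("Engine", "part_of", "Car")]
            print(concept_densities(onto))  # "Car" gets the highest density in this toy ontology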

    Towards a method to quantitatively measure toolchain interoperability in the engineering lifecycle: A case study of digital hardware design

    Get PDF
    The engineering lifecycle of cyber-physical systems is becoming more challenging than ever. Multiple engineering disciplines must be orchestrated to produce both a virtual and physical version of the system. Each engineering discipline makes use of their own methods and tools generating different types of work products that must be consistently linked together and reused throughout the lifecycle. Requirements, logical/descriptive and physical/analytical models, 3D designs, test case descriptions, product lines, ontologies, evidence argumentations, and many other work products are continuously being produced and integrated to implement the technical engineering and technical management processes established in standards such as the ISO/IEC/IEEE 15288:2015 "Systems and software engineering-System life cycle processes". Toolchains are then created as a set of collaborative tools to provide an executable version of the required technical processes. In this engineering environment, there is a need for technical interoperability enabling tools to easily exchange data and invoke operations among them under different protocols, formats, and schemas. However, this automation of tasks and lifecycle processes does not come free of charge. Although enterprise integration patterns, shared and standardized data schemas and business process management tools are being used to implement toolchains, the reality shows that in many cases, the integration of tools within a toolchain is implemented through point-to-point connectors or by applying some architectural style such as a communication bus to ease data exchange and to invoke operations. In this context, the ability to measure the current and expected degree of interoperability becomes relevant: 1) to understand the implications of defining a toolchain (need for different protocols, formats, schemas and tool interconnections) and 2) to measure the effort to implement the desired toolchain. To improve the management of the engineering lifecycle, a method is defined: 1) to measure the degree of interoperability within a technical engineering process implemented with a toolchain and 2) to estimate the effort to transition from an existing toolchain to another. A case study in the field of digital hardware design comprising 6 different technical engineering processes and 7 domain engineering tools is conducted to demonstrate and validate the proposed method. The work leading to these results has received funding from the H2020-ECSEL Joint Undertaking (JU) under grant agreement No 826452-“Arrowhead Tools for Engineering of Digitalisation Solutions” and from specific national programs and/or funding authorities. Funding for APC: Universidad Carlos III de Madrid (Read & Publish Agreement CRUE-CSIC 2023)
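
    The abstract does not give the paper's metric. As an illustrative sketch of how a degree of interoperability could be scored, assuming each tool is described by the protocol, format and schema it exposes (hypothetical facet names, not the method proposed in the paper):

        from itertools import combinations

        def interoperability_degree(connections):
            """Illustrative score: for each tool pair, count the mismatching facets
            (protocol, format, schema); 0 mismatches = fully interoperable (1.0).
            `connections` maps a tool name to its (protocol, format, schema) tuple."""
            pairs = list(combinations(connections.items(), 2))
            if not pairs:
                return 1.0
            score = 0.0
            for (_, a), (_, b) in pairs:
                mismatches = sum(1 for x, y in zip(a, b) if x != y)
                score += 1 - mismatches / len(a)
            return score / len(pairs)

        if __name__ == "__main__":
            toolchain = {
                "requirements_tool": ("http", "reqif", "schema_a"),
                "modeling_tool": ("http", "xmi", "schema_b"),
                "simulation_tool": ("ftp", "mat", "schema_c"),
            }
            print(round(interoperability_degree(toolchain), 2))  # 0.11 for this toy toolchain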

    An analysis of safety evidence management with the Structured Assurance Case Metamodel

    Get PDF
    SACM (Structured Assurance Case Metamodel) is a standard for assurance case specification and exchange. It consists of an argumentation metamodel and an evidence metamodel for justifying that a system satisfies certain requirements. For the assurance of safety-critical systems, SACM can be used to manage safety evidence and to specify safety cases. The standard is a promising initiative towards harmonizing and improving system assurance practices, but its suitability for safety evidence management needs to be further studied. To this end, this paper studies how SACM 1.1 supports this activity according to requirements from industry and from prior work. We have analysed the notion of evidence in SACM, its evidence lifecycle, the classes and associations of the evidence metamodel, and the link of this metamodel with the argumentation one. As a result, we have identified several improvement opportunities and extension possibilities in SACM
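
    As a rough sketch of the evidence-to-argumentation link such a metamodel provides, using simplified stand-in classes rather than the actual SACM 1.1 element names:

        from dataclasses import dataclass, field
        from typing import List

        # Simplified stand-ins for SACM-style elements; not the actual SACM 1.1 class names.
        @dataclass
        class EvidenceItem:
            identifier: str
            description: str
            lifecycle_state: str = "collected"   # e.g. collected -> evaluated -> approved

        @dataclass
        class Claim:
            statement: str
            supporting_evidence: List[EvidenceItem] = field(default_factory=list)

            def is_supported(self) -> bool:
                """A claim counts as supported once at least one evidence item is approved."""
                return any(e.lifecycle_state == "approved" for e in self.supporting_evidence)

        if __name__ == "__main__":
            test_report = EvidenceItem("EV-01", "Unit test report", "approved")
            claim = Claim("The braking function meets its timing requirement", [test_report])
            print(claim.is_supported())  # True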

    Semantic recovery of traceability links between system artifacts

    Get PDF
    This paper introduces a mechanism to recover traceability links between requirements and logical models in the context of critical systems development. Currently, lifecycle processes are covered by a wide range of tools that are used to generate different types of artifacts. One of the cornerstone capabilities in the development of critical systems lies in the possibility of automatically recovering traceability links between system artifacts generated in different lifecycle stages. To do so, it is necessary to establish to what extent two or more of these work products are similar, dependent or should be explicitly linked together. However, the different types of artifacts and their internal representations pose a major challenge to unifying how system artifacts are represented and, then, linked together. That is why, in this work, a concept-based representation is introduced to provide a semantic and unified description of any system artifact. Furthermore, a traceability function is defined and implemented to exploit this new semantic representation and to support the recovery of traceability links between different types of system artifacts. In order to evaluate the traceability function, a case study in the railway domain is conducted to compare the precision and recall of recovering traceability links between text-based requirements and logical model elements. As the main outcome of this work, the use of a concept-based paradigm to represent system artifacts is demonstrated as a building block to automatically recover traceability links within the development lifecycle of critical systems. The research leading to these results has received funding from the H2020 ECSEL Joint Undertaking (JU) under Grant Agreement No. 826452 "Arrowhead Tools for Engineering of Digitalisation Solutions" and from specific national programs and/or funding authorities
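
    The concept-based traceability function itself is not given in the abstract. A minimal sketch, assuming artifacts are already reduced to concept sets and using a plain Jaccard overlap as a stand-in for the paper's function:

        def concept_similarity(concepts_a, concepts_b):
            """Illustrative traceability score: Jaccard overlap of the concept sets
            extracted from two artifacts (the paper's actual function is richer)."""
            a, b = set(concepts_a), set(concepts_b)
            return len(a & b) / len(a | b) if a | b else 0.0

        def recover_links(requirements, model_elements, threshold=0.5):
            """Propose a traceability link whenever the concept overlap is high enough."""
            links = []
            for req_id, req_concepts in requirements.items():
                for elem_id, elem_concepts in model_elements.items():
                    if concept_similarity(req_concepts, elem_concepts) >= threshold:
                        links.append((req_id, elem_id))
            return links

        if __name__ == "__main__":
            reqs = {"REQ-1": {"brake", "command", "latency"}}
            elems = {"Block-A": {"brake", "command", "controller"}, "Block-B": {"door", "sensor"}}
            print(recover_links(reqs, elems))  # [('REQ-1', 'Block-A')]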

    Enabling system artefact exchange and selection through a linked data layer

    Get PDF
    The use of different techniques and tools is a common practice to cover all stages in the systems development lifecycle, generating a large number of system artefacts. Moreover, these artefacts are commonly encoded in different formats and can only be accessed, in most cases, through proprietary and non-standard protocols. This scenario can be considered a real nightmare for software or systems reuse. Possible solutions involve the creation of a truly collaborative development environment where tools can exchange and share data, information and knowledge. In this context, the OSLC (Open Services for Lifecycle Collaboration) initiative pursues the creation of public specifications (data shapes) to exchange any artefact generated during the development lifecycle, by applying the principles of the Linked Data initiative. In this paper, the authors present a solution to enable multi-format system artefact reuse by means of an OSLC-based specification to share and exchange any artefact under the principles of the Linked Data initiative. Finally, two experiments are conducted to demonstrate the advantages of enabling an input/output interface based on an OSLC implementation on top of an existing commercial tool (the Knowledge Manager). Thus, it is possible to enhance the representation and retrieval capabilities of system artefacts by considering the whole underlying knowledge graph generated by the different system artefacts and their relationships. After performing 45 different queries over logical and physical models stored in Papyrus, IBM Rhapsody and Simulink, precision and recall results are promising, showing average values between 70% and 80%. The research leading to these results has received funding from the AMASS project (H2020-ECSEL grant agreement no 692474; Spain's MINECO ref. PCIN-2015-262) and the CRYSTAL project (ARTEMIS FP7-CRitical sYSTem engineering AcceLeration project no 332830-CRYSTAL and the Spanish Ministry of Industry)
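
    As a small sketch of how the reported precision and recall figures can be computed over the 45 queries, assuming each query yields a set of retrieved artefact identifiers and a gold set of relevant ones (the evaluation setup itself is an assumption):

        def precision_recall(retrieved, relevant):
            """Standard precision/recall for one query over artefact identifiers."""
            retrieved, relevant = set(retrieved), set(relevant)
            hits = len(retrieved & relevant)
            precision = hits / len(retrieved) if retrieved else 0.0
            recall = hits / len(relevant) if relevant else 0.0
            return precision, recall

        def average_scores(queries):
            """`queries` is a list of (retrieved, relevant) pairs, one per query."""
            scores = [precision_recall(r, g) for r, g in queries]
            n = len(scores)
            return (sum(p for p, _ in scores) / n, sum(r for _, r in scores) / n)

        if __name__ == "__main__":
            runs = [({"A", "B", "C"}, {"A", "B"}), ({"D"}, {"D", "E"})]
            print(average_scores(runs))  # (0.833..., 0.75)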

    Textual features as a qualitative measure of information in the semi-automatic generation of thesauri

    Get PDF
    The aim of the GTI project is the semi-automatic generation of thesauri through the analysis of a corpus. After testing different methods for classifying the information, from term co-occurrence to neural networks, it became necessary to create new indicators that would provide information in addition to that already supplied by the thesaurus. Presenting these indicators, and their foreseeable potential, is the goal of this communication. The objective is to reuse the large volume of data required to perform the classification and employ it in two different areas: on the one hand, the validation of the thesaurus and, on the other, the creation of indicators that reveal a priori the creativity of a text within our corpus. Under these circumstances, the prior structuring and tagging of the text appear to be a necessary step in order to later study the results of the set of parameters measured over the document set. Novelty is studied from a multidimensional approach: linguistic analysis and analysis of the format of the texts, study of the generated thesaurus, and the creation of ad hoc indicators. At the same time, different parameters are measured on the thesaurus in order to validate the automatically generated thesaurus. For the mathematical analysis of the data, multivariate analysis and principal component analysis are used. An evaluation of the program is currently in progress
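
    Term co-occurrence is one of the classification methods the abstract mentions. A minimal sketch of extracting related-term candidates by document-level co-occurrence, illustrative only and not the GTI pipeline:

        from collections import Counter
        from itertools import combinations

        def cooccurrence_candidates(documents, min_count=2):
            """Count how often term pairs appear in the same document and keep
            the pairs seen at least `min_count` times as thesaurus candidates.
            (A real pipeline would also filter stop words.)"""
            pair_counts = Counter()
            for doc in documents:
                terms = sorted(set(doc.lower().split()))
                pair_counts.update(combinations(terms, 2))
            return [pair for pair, count in pair_counts.items() if count >= min_count]

        if __name__ == "__main__":
            corpus = [
                "automatic indexing of documents",
                "automatic retrieval of documents",
                "thesaurus of documentary descriptors",
            ]
            print(cooccurrence_candidates(corpus))  # pairs such as ('automatic', 'documents')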

    Automatic thesaurus generation: proposal of a linguistic-statistical method

    Get PDF
    This paper presents the research carried out over the last two years within the project Generación Automática de Tesauros Orientada a la Arquitectura de Componentes (Automatic Thesaurus Generation Oriented to Component Architecture). Several methods have been developed to build a thesaurus of descriptors semi-automatically. To that end, an attempt has been made to automate, in many cases successfully, all the phases of this automatic construction, with special emphasis on the knowledge acquisition and knowledge organization phases
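
    The organization phase is not described in detail here. A purely illustrative sketch of one common statistical heuristic for organizing descriptors (term subsumption), not necessarily the method proposed in this work:

        def broader_narrower(doc_terms, threshold=0.8):
            """Illustrative subsumption heuristic: A is proposed as broader than B
            if A occurs in at least `threshold` of B's documents and B occurs in
            fewer documents than A. `doc_terms` is a list of per-document term sets."""
            vocab = set().union(*doc_terms)
            docs_with = {t: {i for i, d in enumerate(doc_terms) if t in d} for t in vocab}
            relations = []
            for a in vocab:
                for b in vocab:
                    if a == b or len(docs_with[b]) >= len(docs_with[a]):
                        continue
                    overlap = len(docs_with[a] & docs_with[b]) / len(docs_with[b])
                    if overlap >= threshold:
                        relations.append((a, "broader_than", b))
            return relations

        if __name__ == "__main__":
            docs = [{"retrieval", "indexing"}, {"retrieval", "thesaurus"}, {"retrieval"}]
            print(broader_narrower(docs))  # e.g. [('retrieval', 'broader_than', 'indexing'), ...]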

    New patterns in the representation and visualization of information for distributed environments: from the thesaurus to the topic map

    Get PDF
    The emergence of a new information reality, with the appearance of information resources in electronic formats, has made it necessary to turn substantially to documentary languages, and especially to thesauri, as tools for the indexing and retrieval of textual documents as well as of software and programming tools. Nevertheless, information professionals have found it necessary to rethink thesauri in order to adapt them not only to the retrieval of these new information objects, but also to the new forms of access and the new navigation capabilities available to users through hypertext. The state of the art in the evolution of thesauri to adapt to this new reality is described and assessed, with particular attention to concept maps and topic maps as dynamic environments better suited to a more semantic and contextual retrieval within distributed information environments
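
    As a minimal sketch of the topic map constructs the abstract refers to (topics, associations and occurrences), with simplified names rather than the full ISO 13250 model:

        from dataclasses import dataclass, field
        from typing import List, Tuple

        @dataclass
        class Topic:
            name: str
            occurrences: List[str] = field(default_factory=list)   # links to actual resources

        @dataclass
        class TopicMap:
            topics: dict = field(default_factory=dict)
            associations: List[Tuple[str, str, str]] = field(default_factory=list)  # (topic, relation, topic)

            def related(self, name: str):
                """Contextual navigation: follow associations from one topic to its neighbours."""
                return [(rel, b) for a, rel, b in self.associations if a == name] + \
                       [(rel, a) for a, rel, b in self.associations if b == name]

        if __name__ == "__main__":
            tm = TopicMap()
            tm.topics["thesaurus"] = Topic("thesaurus", ["http://example.org/doc1"])
            tm.topics["topic map"] = Topic("topic map")
            tm.associations.append(("thesaurus", "evolves_into", "topic map"))
            print(tm.related("topic map"))  # [('evolves_into', 'thesaurus')]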

    A study of some contextual aspects of medical discourse using an automatic module for document processing

    Get PDF
    In recent years, the improvement of computing tools has brought a notable increase in the efficiency of documentation tools. The result has been a considerable growth in full-text electronic documents, which entails a greater need to reduce noise in document retrieval. To address this problem, one of the fastest-growing solutions, from both a documentary and a linguistic perspective, is the exploration of context through discourse analysis. In this study, a tool has been developed that, following a multidimensional approach, makes it possible to characterize certain discourses depending on the diaphasic and diamesic variations of language. The aspects analysed focus on stylistic, typological and thematic features of written discourse. The documentation tool developed includes filtering and automatic classification algorithms. Likewise, the MeSH vocabulary has been implemented as a comparison tool. The analysis has been completed with a multivariate statistical analysis. Significant differences were obtained between the different aspects studied, which supports the use of these approaches to improve automated documentation tools. The authors wish to state that this study was funded by the Consejería de Educación de la Comunidad Autónoma de Madrid, within the project entitled "Aplicación de técnicas informétricas a la construcción automática de tesauros"
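
    The module's classification algorithms are not specified in the abstract. An illustrative sketch of thematic classification against a controlled vocabulary, using a few MeSH descriptors purely as examples:

        def classify_by_vocabulary(text, vocabulary):
            """Illustrative thematic classification: count how many controlled
            vocabulary terms (e.g. MeSH descriptors) appear in a document."""
            tokens = set(text.lower().split())
            matched = [term for term in vocabulary if term.lower() in tokens]
            return {"matched_terms": matched, "coverage": len(matched) / len(vocabulary)}

        if __name__ == "__main__":
            # A couple of single-word MeSH descriptors used purely as examples.
            mesh_sample = ["Asthma", "Bronchitis", "Pneumonia", "Influenza"]
            abstract = "The patient presented asthma symptoms after a mild influenza episode"
            print(classify_by_vocabulary(abstract, mesh_sample))
            # {'matched_terms': ['Asthma', 'Influenza'], 'coverage': 0.5}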